After completing this session, students should be able to:
Practice the global Needleman-Wunsch alignment algorithm in their notebook
Align a sequence
Download R
Download and align SARS-CoV-2 sequences from the State of Bahia, Brazil
ALGORITHM
Steps:
Let’s see an example.
Align two sequences in the notebook
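As a starting point for the notebook exercise, here is a minimal Python sketch of global alignment with the Needleman-Wunsch algorithm. The scoring scheme (match = 1, mismatch = -1, gap = -1) and the function name `needleman_wunsch` are assumptions chosen for illustration, not values given in this session. When several alignments share the best score, this sketch returns only one of them.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (aligned_a, aligned_b, score).

    The default scores are illustrative assumptions, not course-specified values.
    """
    n, m = len(seq_a), len(seq_b)

    # score[i][j] holds the best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap  # seq_a prefix aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap  # seq_b prefix aligned against gaps only

    def pair_score(i, j):
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the matrix: each cell takes the best of diagonal, up, and left moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in seq_b
                score[i][j - 1] + gap,                   # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            aligned_a.append(seq_a[i - 1])
            aligned_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1

    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b)), score[n][m]


if __name__ == "__main__":
    top, bottom, total = needleman_wunsch("GATTACA", "GCATGCU")
    print(top)
    print(bottom)
    print("score:", total)
```

Changing the `match`, `mismatch`, and `gap` arguments and rerunning the example is a quick way to see how the scoring scheme shapes the alignment.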
BLAST
Fragment of SARS-CoV-2 sequence to blast:
An algorithm can also be called a pipeline.
The first step to develop an algorithm is to objectively explain how to answer a question or solve a problem.
A variant calling algorithm identifies variants (mutations) in the genome of an organism.
It is also called a SNP calling pipeline.
Here is the pseudocode of a SNP calling pipeline:
Align to reference sequence (FASTA)
Compare alignment to reference (SAM)
Annotate differences (mutations) (VCF)
Extract mutations from VCF using script
Construct a SNP Frequency Table
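The last two steps of the pseudocode (extracting mutations from the VCF and building a SNP frequency table) can be sketched in plain Python. This is only an illustration under stated assumptions: the VCF text below is hand-written and hypothetical, not real data; the helper names `extract_snps` and `snp_frequency_table` are invented for this example; and "frequency table" is interpreted here as a count of each substitution type (for example A>G), which is one of several reasonable readings.

```python
from collections import Counter

# Hypothetical, hand-written VCF content for illustration only (not real data).
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
ref\t100\t.\tA\tG\t50\tPASS\t.
ref\t250\t.\tC\tT\t60\tPASS\t.
ref\t300\t.\tA\tG\t45\tPASS\t.
ref\t410\t.\tG\tGA\t30\tPASS\t.
"""


def extract_snps(vcf_text):
    """Return a list of (position, ref, alt) for single-nucleotide variants."""
    snps = []
    for line in vcf_text.splitlines():
        # Skip blank lines and header lines (both '##' metadata and '#CHROM').
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        pos, ref, alts = int(fields[1]), fields[3], fields[4]
        # ALT may list several alleles separated by commas.
        for alt in alts.split(","):
            # Keep only single-base substitutions; indels are skipped.
            if len(ref) == 1 and len(alt) == 1 and alt != ".":
                snps.append((pos, ref, alt))
    return snps


def snp_frequency_table(snps):
    """Count how often each substitution type (e.g. 'A>G') occurs."""
    return dict(Counter(f"{ref}>{alt}" for _, ref, alt in snps))


if __name__ == "__main__":
    snps = extract_snps(EXAMPLE_VCF)
    for change, count in sorted(snp_frequency_table(snps).items()):
        print(change, count)
```

In a real pipeline the VCF would come from the earlier alignment and variant annotation steps, and a dedicated VCF parsing library would usually be preferable to splitting lines by hand.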
Schematic of a SNP calling pipeline
The blue boxes indicate the analysis being performed.
The text above the boxes indicates the software used for each analysis. The figure is from r-charts (n.d.).
Software development first considers the analytical steps in human language.
Then, the software product considers the steps the machine will execute.
How are the files produced, and what are the processing steps?
Where in the computational infrastructure are the files stored?
We can develop our own computational methods to understand biology and propose solutions.
To do that, we need to follow these three steps for developing a computational algorithm that will solve a problem: